Enabling Efficient Dynamic Resizing of Large DRAM Caches via A Hardware Consistent Hashing Mechanism

Authors

Kevin Kai-Wei Chang

Gabriel H. Loh

Mithuna Thottethodi

Yasuko Eckert

Mike O'Connor

Srilatha Manne

Lisa Hsu

Lavanya Subramanian

Onur Mutlu

Abstract

Die-stacked DRAM has been proposed for use as a large, high-bandwidth, last-level cache with hundreds or thousands of megabytes of capacity. Not all workloads (or phases) can productively utilize this much cache space, however. Unfortunately, the unused (or under-used) cache continues to consume power due to leakage in the peripheral circuitry and periodic DRAM refresh. Dynamically adjusting the available DRAM cache capacity could largely eliminate this energy overhead. However, the current proposed DRAM cache organization introduces new challenges for dynamic cache resizing. The organization differs from a conventional SRAM cache organization because it places entire cache sets and their tags within a single bank to reduce on-chip area and power overhead. Hence, resizing a DRAM cache requires remapping sets from the powered-down banks to active banks. In this paper, we propose CRUNCH (Cache Resizing Using Native Consistent Hashing), a hardware data remapping scheme inspired by consistent hashing, an algorithm originally proposed to uniformly and dynamically distribute Internet traffic across a changing population of web servers. CRUNCH provides a load-balanced remapping of data from the powered-down banks alone to the active banks, without requiring sets from all banks to be remapped, unlike naive schemes to achieve load balancing. CRUNCH remaps only sets from the powered-down banks, so it achieves this load balancing with low bank power-up/down transition latencies. CRUNCH’s combination of good load balancing and low transition latencies provides a substrate to enable efficient DRAM cache resizing.
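The property the abstract relies on, that powering down a bank moves only that bank's sets while every other set stays where it was, can be illustrated with a small software model of consistent hashing. This is a minimal sketch under stated assumptions, not the CRUNCH hardware mechanism itself: the `Ring` class, the `_h` helper, the use of SHA-256, and the choice of 64 virtual points per bank are all hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def _h(key: str) -> int:
    # Stable 32-bit hash; hashlib is used instead of built-in hash()
    # so that the mapping is reproducible across runs.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    """Illustrative consistent-hash ring mapping cache sets to banks."""

    def __init__(self, banks, vnodes=64):
        # Each bank owns `vnodes` points on the ring (an assumed value).
        self.points = sorted(
            (_h(f"bank{b}:v{v}"), b) for b in banks for v in range(vnodes)
        )
        self.active = set(banks)

    def power_down(self, bank):
        self.active.discard(bank)

    def power_up(self, bank):
        self.active.add(bank)

    def lookup(self, set_index):
        # Walk clockwise from the set's hash to the first ring point
        # whose owning bank is still active.
        if not self.active:
            raise RuntimeError("no active banks")
        start = bisect.bisect_left(self.points, (_h(f"set{set_index}"), -1))
        n = len(self.points)
        for i in range(n):
            bank = self.points[(start + i) % n][1]
            if bank in self.active:
                return bank


def mapping(ring, num_sets):
    # Snapshot of the set-to-bank assignment for sets 0..num_sets-1.
    return [ring.lookup(s) for s in range(num_sets)]
```

In this model, comparing `mapping(ring, n)` before and after `ring.power_down(b)` shows that only sets previously assigned to bank `b` change owner, and they are spread over the remaining banks according to where bank `b`'s ring points fall. Powering the bank back up restores the original assignment.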


Similar resources

Addendum to “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches”

Abstract The MICRO 2011 paper “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches” proposed a novel die-stacked DRAM cache organization embedding the tags and data within the same physical DRAM row and then using compound access scheduling to manage the hit latency and a MissMap structure to make misses more efficient. This addendum provides a revised performan...


DRAM Aware Last-Level-Cache Policies for Multi-core Systems

x latency DTC in two cycles. In contrast, state-of-the-art DRAM cache always reads the tags from DRAM cache that incurs high tag lookup latencies of up to 41 cycles. In summary, high DRAM cache hit latencies, increased inter-core interference, increased inter-core cache eviction, and the large application footprint of complex applications necessitates efficient policies in order to satisfy the ...


Multi-Level Cache Resizing

Hardware designers are constantly looking for ways to squeeze waste out of architectures to achieve better power efficiency. Cache resizing is a technique that can remove wasteful power consumption in caches. The idea is to determine the minimum cache a program needs to run at near-peak performance, and then reconfigure the cache to implement this efficient capacity. While there has been signif...


C3D: Mitigating the NUMA bottleneck via coherent DRAM caches

Massive datasets prevalent in scale-out, enterprise, and high-performance computing are driving a trend toward ever-larger memory capacities per node. To satisfy the memory demands and maximize performance per unit cost, today’s commodity HPC and server nodes tend to feature multi-socket shared memory NUMA organizations. An important problem in these designs is the high latency of accessing mem...


MigrantStore: Leveraging Virtual Memory in DRAM-PCM Memory Architecture

With the imminent slowing down of DRAM scaling, Phase Change Memory (PCM) is emerging as a lead alternative for main memory technology. While PCM achieves low energy due to various technology-specific advantages, PCM is significantly slower than DRAM (especially for writes) and can endure far fewer writes before wearing out. Previous work has proposed to use a large, DRAM-based hardware cache t...



Journal: CoRR

Volume: abs/1602.00722

Publication date: 2016
